Abstract: There has been substantial progress recently in 
understanding toy problems of purely implicit signaling. These 
are problems where the source and the channel are implicit — 
the message is generated endogenously by the system, and the 
plant itself is used as a channel. In this paper, we explore how 
implicit and explicit communication can be used synergistically 
to reduce control costs. 

The setting is an extension of Witsenhausen's counterexample 
where a rate-limited external channel connects the two con- 
trollers. Using a semi-deterministic version of the problem, we 
arrive at a binning-based strategy that can outperform the best 
known strategies by an arbitrarily large factor. 

We also show that our binning-based strategy attains within a 
constant factor of the optimal cost for an asymptotically infinite- 
length version of the problem uniformly over all problem 
parameters and all rates on the external channel. For the scalar 
case, although our results yield approximate optimality for each 
fixed rate, we are unable to prove approximate optimality 
uniformly over all rates. 

I. Introduction 

In his layered approach to design of decentralized control 
systems [1], Varaiya dedicates an entire layer for coordinating 
the actions of various agents. The question is: how can the 
agents build this coordination? 

The most natural way to build coordination is through 
communication. To begin with, let us assume that the source 
and the channel have been specified explicitly. Even with this 
simplification, the general problem of multiterminal informa- 
tion theory has proven to be hard. The community therefore 
resorted to building a bottom-up theory that starts from Shan- 
non's toy problem of point-to-point communication [2]. The 
insights and tools obtained from this toy problem have helped 
immensely in the continuing development of multiterminal 
information theory. 

A more accurate model of a dynamic control system is 
where the source can evolve with time, reflecting the impact 
of random perturbations and control actions. A counterpart of 
Shannon's point-to-point toy problem that models evolution 
due to random perturbations is a problem of communicating 
an unstable Markov source across a channel. The problem is 
reasonably well understood [3]-[7], and again, building on 
the understanding for this toy problem, the community has 
begun exploring multicontroller problems [8], [9]. 

Do the above models encompass the possible ways of building coordination? Because these models are motivated by an architectural separation of estimation and control, they do not model the impact of control actions on state evolution.^1 Is this aspect important? Indeed, in decentralized control systems, it is often possible to modify what is to be communicated before communicating it. At times, it is also unclear what medium to use for communicating the message [12, Ch. 1]. That is, the sources and the channels may not be as explicit as assumed in traditional communication models. To understand this issue, we informally define implicit communication to be one of the following two phenomena arising in decentralized control: 

• Implicit message: the message itself is generated en- 
dogenously by the control system. 

• Implicit channel: the system to be controlled is used as 
a channel to communicate. 

The first phenomenon, that of implicit messages, poses an 
intellectual challenge to information theorists. How does one 
communicate a message that is endogenously generated, and 
hence can potentially be affected by the policy choice? 

The second phenomenon, that of viewing the plant as an implicit communication channel, is challenging from a control theoretic standpoint. The control actions now perform a dual role: control of the system (i.e., minimizing immediate costs), and communication through the system (presumably to lower future costs). 



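[Figure 1: (a) block diagram of the Witsenhausen counterexample, with observation noise $z \sim \mathcal{N}(0,1)$ and cost $k^2 E[u_1^2] + E[(x_1 - \hat{x}_1)^2]$; (b) the equivalent representation, labeling the implicit message and the implicit channel, with $E[u_1^2] \le P$ and $\mathrm{MMSE} = E[(x_1 - \hat{x}_1)^2]$.]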



Fig. 1. The Witsenhausen counterexample, shown in (a) is the minimalist 
toy problem that exhibits the two notions of implicit communication, shown 
in (b), which is an equivalent representation [13]. 

1 Communication has also been used to build coordination by generating 
correlation between random variables [10], [11]. 



The counterpart of Shannon's point-to-point problem in implicit communication is a decentralized two-controller problem called Witsenhausen's counterexample [14], shown in Fig. 1. The message, state $x_1$, is implicit because it can be affected by the input $u_1$ of the first controller. The channel is implicit because the system state itself is used to communicate the message. 

Despite substantial efforts of the community, the counterexample remains unsolved, and as a result the community could not build on the problem to address larger control networks of this nature. Recently, however, we showed that using the input to quantize the state (complemented by linear strategies) attains within a constant factor of the optimal cost uniformly over all problem parameters for the counterexample and its vector extensions [13], [15]. Building on this provable approximate optimality, we have been able to obtain similar results for many extensions to the counterexample^2 [12], [18]-[21]. 

When is it useful to communicate implicitly? To understand this, Ho and Chang [22] introduce the concept of partially-nested information structures. Their results can be interpreted in the following manner: when transmission delay across a noiseless, infinite-capacity external channel is smaller than the propagation delay of implicit communication, there is no advantage in communicating implicitly.^3 The system designer always has the engineering freedom to attach an external channel. Can this external channel obviate the need to consider implicit communication? 

In practice, however, the channel is never perfect. In [12, Ch. 1], we compare problems of implicit and explicit communication where the respective channels are noisy. Assuming that the weights on quadratic costs on inputs and reconstruction are the same for implicit and explicit communication, we show that implicit communication can outperform various architectures of explicit communication by an arbitrarily large factor! The gain is due to the implicit nature of the messages: the simplified source after actions of the controller can be communicated with much greater fidelity for the same power cost. 

So an external channel should not be thought of as a substitute for implicit communication. But if an external channel is available, how should it be used in conjunction with implicit communication? To examine this, we consider an extension of Witsenhausen's counterexample (shown in Fig. 2) where an external channel connects the two controllers. A special case when the channel is power constrained and has additive Gaussian noise has been considered by Shoarinejad et al. [25] and Martins [26]. Shoarinejad et al. observe that when the channel noise variance diverges to infinity, the problem approaches Witsenhausen's counterexample, while linear strategies are optimal in the limit of zero noise. Martins considers the case of finite noise variance and shows that in some cases, there exist nonlinear strategies that outperform all linear strategies. 

2 Approximate-optimality results of this nature have proven useful in information theory as well: building on smaller problems [16], significant understanding has been gained about larger systems [17]. 

3 The same conclusion is drawn in the work of Rotkowitz and Lall [23] (as an application of quadratic invariance) and that of Yüksel [24] in more general frameworks. 



In Section III, we provide an improvement over Martins's strategy based on intuition obtained from a semi-deterministic version of the problem. In Section IV, we show that our strategy can outperform Martins's strategy by an arbitrarily large factor. Because we interpret the problem as communication across two parallel channels, an implicit one and an explicit one, our strategy ensures that the information on implicit and explicit channels is essentially orthogonal. Without the implicit channel output, the message our strategy sends on the explicit channel would yield little information about the state. But the observations on the two channels jointly reveal a lot more about the state. This eliminates a redundancy in Martins's strategies where the same message is duplicated over the implicit and explicit channels. In this sense, our results here also provide a justification for the utility of the concept of implicit communication. 

For simplicity, we assume a fixed-rate noiseless external channel for most of the paper. In Section V-A, our binning strategy is proved to be approximately optimal for all problem parameters and all rates on the external channel for an asymptotic vector version of the problem. In Section V-B, using tools from large-deviation theory and KL-divergence, we obtain a lower bound on the costs for finite vector lengths. Using this lower bound, we show that our improved strategy is within a constant factor of optimal for any fixed rate $R_{ex}$ on the external channel for the scalar case. However, we do not yet have an approximately optimal solution that is uniform over the external channel's rate: the ratio of upper and lower bounds diverges to infinity as $R_{ex} \to \infty$. We conclude in Section VI. 

II. Notation and problem statement 

Vectors are denoted in bold, with a superscript to denote their length (e.g. $\mathbf{x}^m$ is a vector of length $m$). Upper case is used for random variables or random vectors (except when denoting power $P$), while lower case symbols represent their realizations. Hats ($\hat{\cdot}$) on the top of random variables denote the estimates of the random variables. The block diagram for the extension of Witsenhausen's counterexample considered in this paper is shown in Fig. 2. $S^m(r)$ denotes a sphere of radius $r$ centered at the origin in $m$-dimensional Euclidean space $\mathbb{R}^m$. $\mathrm{Vol}(A)$ denotes the volume of the set $A$ in $\mathbb{R}^m$. 

A control strategy is denoted by $\gamma = (\gamma_1, \gamma_2)$, where $\gamma_i$ is the function that maps the observations at $C_i$ to the control inputs. The first controller observes $\mathbf{y}_1^m = \mathbf{x}_0^m$ and generates a control input $\mathbf{u}_1^m$ that affects the system state, and a message $W \in \{0, 1, \ldots, 2^{mR_{ex}} - 1\}$ (that can also be viewed as a control input) for the second controller that is sent across a parallel channel. 

The second controller observes $\mathbf{y}_2^m = \mathbf{x}_1^m + \mathbf{z}^m$, where $\mathbf{z}^m$ is the disturbance, or the noise at the input of the second controller. It also observes perfectly the message $W$ sent by the first controller. The total cost is a quadratic function of the state and the input given by: 

$$J^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m) = \frac{1}{m} k^2 \|\mathbf{u}_1^m\|^2 + \frac{1}{m} \|\mathbf{x}_2^m\|^2,$$

where $\mathbf{u}_1^m = \gamma_1(\mathbf{x}_0^m)$, $\mathbf{x}_2^m = \mathbf{x}_0^m + \gamma_1(\mathbf{x}_0^m) - \mathbf{u}_2^m$, and $\mathbf{u}_2^m = \gamma_2(\mathbf{x}_0^m + \gamma_1(\mathbf{x}_0^m) + \mathbf{z}^m)$. The cost expression includes a division by the vector length $m$ to allow for natural comparisons between different vector lengths. 

Fig. 2. The scalar version of the problem of implicit and explicit communication considered in this paper. An external channel connects the two controllers. In the absence of implicit communication, the optimal strategy is linear. In the absence of explicit communication, an approximately optimal strategy is quantization. Therefore, a natural strategy for this problem of implicit and explicit communication, proposed in [26], is to communicate linearly over the external channel, and use quantization over the implicit channel. Fig. 5 shows that our binning-based synergistic strategy can outperform this natural strategy by an arbitrarily large factor. 

Fig. 3. A semi-deterministic model for the toy problem of implicit and explicit communication. An external channel (for this example, of capacity two bits) connects the two controllers. The case $\sigma_0^2 > 1$ is shown in (a), while $\sigma_0^2 < 1$ is shown in (b). 

Subscripts in expectation expressions denote the random variable being averaged over (e.g. $E_{\mathbf{X}_0^m, \mathbf{Z}_G^m}[\cdot]$ denotes averaging over the initial state $\mathbf{X}_0^m$ and the test noise $\mathbf{Z}_G^m$). 

III. A semi-deterministic model 

We extend the deterministic abstraction of Gaussian communication networks proposed in [17], [27] to a semi-deterministic model for our problem of Section II. 

• Each system variable is represented in binary. For instance, in Fig. 3, the state is represented by $b_1 b_2 b_3 . b_4 b_5$, where $b_1$ is the highest order bit, and $b_5$ is the lowest. 

• The location of the decimal point is determined by the signal-to-noise ratio (SNR), where signal refers to the state or input to which noise is added. It is given by $\lfloor \log_2(\mathrm{SNR}) \rfloor - 1$. Noise can only affect the bit before the decimal point and the bits following it, that is, $b_3$, $b_4$ and $b_5$. 

• The power of a random variable $A$, denoted by $\mathrm{pow}(A)$, is defined as the highest order bit that is 1 among all the possible (binary-represented) values that $A$ can take with nonzero probability.^4 For instance, if $A \in \{0.01, 0.11, 0.1, 0.001\}$, then $A$ has the power $\mathrm{pow}(A) = 0.1$. 

• Additions/subtractions in the original model are replaced by bit-wise XORs. Noise is assumed to be iid Ber(0.5). 

• The capacity of the external channel in the semi-deterministic version is the integer part (floor) of the capacity of the actual external channel. 
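As an illustration of the $\mathrm{pow}(\cdot)$ definition in the list above, the following minimal Python sketch computes the power of a variable from its binary-represented values. The helper names `bit_weights` and `power`, and the use of strings for binary expansions, are assumptions made for this example rather than notation from the model.

```python
from fractions import Fraction


def bit_weights(value: str) -> list:
    """Weights of the bits equal to 1 in a binary string such as '0.011' or '101.1'."""
    integer_part, _, fractional_part = value.partition(".")
    weights = []
    # Bits left of the point carry weights 1, 2, 4, ... (read right to left).
    for position, bit in enumerate(reversed(integer_part)):
        if bit == "1":
            weights.append(Fraction(2) ** position)
    # Bits right of the point carry weights 1/2, 1/4, ...
    for position, bit in enumerate(fractional_part, start=1):
        if bit == "1":
            weights.append(Fraction(1, 2 ** position))
    return weights


def power(values: list) -> Fraction:
    """pow(A): weight of the highest-order bit that is 1 in any value A can take."""
    all_weights = [w for v in values for w in bit_weights(v)]
    return max(all_weights, default=Fraction(0))


# The example from the text: binary 0.1 is the fraction 1/2.
print(power(["0.01", "0.11", "0.1", "0.001"]))
```

Exact fractions are used so that the result can be compared directly with a binary expansion without floating-point rounding.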

We note here that unlike in the information-theoretic deter- 
ministic model of [17], the binary expansions in our model 
are valuable even after the decimal point (below noise level). 
Indeed, the model is not deterministic as random noise is 
modeled in the system.^5 This move from deterministic to 
semi-deterministic models is needed in decentralized control 
because one of the three roles of control actions is to improve 
the estimability of the state when observed noisily (the other 
two roles being control and communication). Since smart 
choices of control inputs can reduce the state uncertainty 
in the LQG model, a simplified model should allow for this 
possibility as well (the matter is discussed at length in [12]). 

The semi-deterministic abstraction for our extension of Witsenhausen's counterexample is shown in Fig. 3. The original cost of $k^2 u_1^2 + x_2^2$ now becomes $k^2 \mathrm{pow}(u_1) + \mathrm{pow}(x_2)$. As in Fig. 2, the encoder for this semi-deterministic model observes $x_0$ noiselessly. Addition is represented by XORs, with the relative power of the terms to be added deciding which bits are affected. For instance, in Fig. 3, the power of the encoder input is sufficient to only affect the last bits of the state $x_0$. The noise bits are assumed to be distributed iid Ber(0.5). 

4 We note that our definition of $\mathrm{pow}(A)$ is for clarity and convenience, and is far from unique among good choices. 

5 An erasure-based deterministic model for noise can instead be used. This model also has the same optimal strategies. 

A. Optimal strategies for the semi-deterministic abstraction 

We characterize the optimal tradeoff between the input power $\mathrm{pow}(u_1)$ and the power in the MMSE error $\mathrm{pow}(x_2)$. The minimum total cost problem is a convex dual of this problem, and can be obtained easily. Let the power of $x_0$, $\mathrm{pow}(x_0)$, be $\sigma_0^2$. The noise power is assumed to be 1. 

Case 1: $\sigma_0^2 > 1$. 
This case is shown in Fig. 3(a). The bits $b_1$, $b_2$ are communicated noiselessly to the decoder, so the encoder does not need to communicate them implicitly or explicitly. The external channel has a capacity of two bits, so it can be used to communicate two of $b_3$, $b_4$ and $b_5$. It should be used to communicate the higher-order bits among those corrupted by noise, i.e., bits $b_3$, $b_4$. The control input $u_1$ should be used to modify the lower-order bits (bit $b_5$ in Fig. 3). In the example shown, if $P < 0.01$, MMSE $= 0.01$, else MMSE $= 0$. 

Case 2: $\sigma_0^2 < 1$. 
In this case (shown in Fig. 3(b)), the signal power is smaller than the noise power. All the bits are therefore corrupted by noise, and nothing can be communicated across the implicit channel. In order for the decoder to be able to decode any bit in the representation of $x_1$, it must either a) know the bit in advance (for instance, the encoder can force the bit to 0), or b) be communicated the bit on the external channel. Since the encoder should use minimum power, it is clear that the most significant bits of the state (bits $b_1$, $b_2$ in Fig. 3(b)) should be communicated on the external channel. The encoder, if it has sufficient power, can then force the lower order bits ($b_3$, $b_4$ in Fig. 3(b)) of $x_1$ to zero. In the example shown in Fig. 3(b), if $P < 0.001$, MMSE $= 0.001$, else MMSE $= 0$. 
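The strategy for the case $\sigma_0^2 > 1$ can be checked by brute force. The sketch below assumes a five-bit state in which noise reaches bits $b_3$, $b_4$, $b_5$ and a two-bit external channel; the function names and the list-of-bits representation are hypothetical choices made for illustration.

```python
import itertools
import random


def encode(state, enough_power):
    """Return (x1, message): the state after the control input, and the external message."""
    message = (state[2], state[3])   # highest-order bits that noise can corrupt (b3, b4)
    x1 = list(state)
    if enough_power:
        x1[4] ^= state[4]            # XOR input u1 = b5 forces the lowest bit to 0
    return x1, message


def implicit_channel(x1, rng):
    """b1, b2 pass through cleanly; b3, b4, b5 are XORed with iid Ber(0.5) noise."""
    return x1[:2] + [bit ^ rng.randint(0, 1) for bit in x1[2:]]


def decode(y, message, enough_power):
    """Estimate x1 from the noisy observation y and the external-channel message."""
    last_bit = 0 if enough_power else y[4]   # without forcing, b5 can only be guessed
    return y[:2] + list(message) + [last_bit]


def always_correct(enough_power, trials=200, seed=0):
    """True if the decoder recovers x1 for every state and every sampled noise."""
    rng = random.Random(seed)
    for state in itertools.product([0, 1], repeat=5):
        for _ in range(trials):
            x1, message = encode(list(state), enough_power)
            if decode(implicit_channel(x1, rng), message, enough_power) != x1:
                return False
    return True
```

With enough input power to flip the lowest bit, the decoder recovers the state exactly; without it, the lowest bit stays ambiguous, which corresponds to the nonzero MMSE in the example.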

B. What scheme does the semi-deterministic model suggest 
over reals? 

A linear communication scheme over the external channel would correspond to communicating the highest-order bits of the state. The scheme for the semi-deterministic abstraction (Section III-A) communicates instead the highest order bits that are at or below the noise level. This suggests that the external channel should not be used in a linear fashion, since the higher order bits are already known at the decoder. Instead, the external channel should be used to communicate bits that are corrupted by noise, that is, more refined information about the state that is not already implicitly communicated by the noisy state itself. 

The resulting scheme for the problem over reals is illustrated in Fig. 4. The encoder forces lower order bits of the state to zero, thereby truncating the binary expansion, or effectively quantizing the state into bins. The higher order bits that are corrupted by noise ($b_3$, $b_4$ in Fig. 3(a)) are communicated via the external channel. These bits can be thought of as representing the color, i.e. the bin index, of quantization bins, where each set of $2^{R_{ex}}$ consecutive quantization bins is labelled with $2^{R_{ex}}$ colors in a fixed order (with zero, for instance, colored blue). The bin index associated with the color of the bin is sent across the external channel. The decoder finds the quantization point nearest to $y_2$ that has the same bin index as that received across the external channel. 
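The binning scheme described above can be sketched for the scalar case as follows. This is a minimal illustration under stated assumptions: quantization points are placed at integer multiples of a hypothetical `step`, the number of colors `num_colors` plays the role of $2^{R_{ex}}$, and the function names are not from the paper.

```python
def encode(x0, step, num_colors):
    """Force x0 to the nearest quantization point and pick the color to send externally."""
    q = round(x0 / step)          # index of the nearest quantization point
    x1 = q * step                 # state after the control input u1 = x1 - x0
    return x1, q % num_colors     # the color (bin index) goes on the external channel


def decode(y2, color, step, num_colors):
    """Nearest quantization point to y2 among the points carrying the received color."""
    cycles = round((y2 / step - color) / num_colors)
    return (color + num_colors * cycles) * step


x1, color = encode(3.3, step=1.0, num_colors=4)
noisy = x1 + 1.4                  # noise larger than half a bin width
print(decode(noisy, color, step=1.0, num_colors=4), round(noisy))
```

In this sketch the decoder is correct whenever the noise magnitude is below `num_colors * step / 2`, whereas a decoder without the color needs it below `step / 2`; this is the increase in distance between valid points that the external channel buys.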

The scheme is very similar to the binning scheme used for Wyner-Ziv coding of a Gaussian source with side information [28], which is not surprising because of the similarity of our problem with the Wyner-Ziv formulation. 

Fig. 4. The strategy intuited from the semi-deterministic model naturally yields a binning-based strategy for reals that leads to a synergistic use of implicit and explicit communication. The external channel gives the decoder the bin index (in this example, the index is 1). The more significant bits (coarse bin) are received from the implicit channel. Effectively, use of the external channel increases the distance between the 'valid' codewords by a factor of $2^{R_{ex}}$. 



IV. Gaussian external channel 

A more realistic model of the external channel is a power 
constrained additive Gaussian noise channel, which was con- 
sidered in [25], [26]. Without loss of generality, we assume 
that the noise in the external channel is also of variance 1. 

At finite lengths, an upper bound can be calculated using binning-based strategies. This binning strategy turns out to outperform Martins's strategy by a factor that diverges to infinity. The key is to choose the set of problems where the initial state variance and the power on the external channel, denoted by $P_{ex}$, are almost equal. In this case, a strategy that communicates the state on the external channel is not helpful, because the implicit channel can communicate the state at almost the same fidelity. Fig. 5 shows that, fixing the relation $P_{ex} = \sigma_0^2$, as $\sigma_0^2 \to \infty$ the ratio of the cost attained by Martins's strategy to that attained by the binning strategy diverges to infinity. 

Fig. 5. If the SNR on the external channel is made to scale with the SNR of the initial state, then our binning-based strategy outperforms the strategy in [26] by a factor that diverges to infinity. 



V. Asymptotic and scalar versions of the problem 



A. Asymptotic version 



We now show that the binning strategy of Section III is 
approximately-optimal in the limit of infinite-lengths. 

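Theorem 1: For the extension of Witsenhausen's counterexample with an external channel connecting the two controllers, the optimal asymptotic cost $\bar{J}_{opt}$ satisfies 

$$\inf_{P \ge 0} k^2 P + \left(\left(\sqrt{\kappa} - \sqrt{P}\right)^+\right)^2 \le \bar{J}_{opt} \le \mu \left( \inf_{P \ge 0} k^2 P + \left(\left(\sqrt{\kappa} - \sqrt{P}\right)^+\right)^2 \right),$$

where $\mu \le 64$, $\kappa = \frac{\sigma_0^2 2^{-2R_{ex}}}{(\sigma_0 + \sqrt{P})^2 + 1}$, and the upper bound is achieved by binning-based quantization strategies. Numerical evaluation shows that $\mu < 8$. 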

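Proof: Lower bound 
We need the following lemma from [13, Lemma 3]. 

Lemma 1: For any three random vectors $A$, $B$ and $C$, 

$$\sqrt{E[\|B - C\|^2]} \ge \sqrt{E[\|A - C\|^2]} - \sqrt{E[\|A - B\|^2]}.$$

Proof: See [13]. 

Substituting $\mathbf{X}_0^m$ for $A$, $\mathbf{X}_1^m$ for $B$ and $\mathbf{U}_2^m$ for $C$ in Lemma 1, 

$$\sqrt{E[\|\mathbf{X}_1^m - \mathbf{U}_2^m\|^2]} \ge \sqrt{E[\|\mathbf{X}_0^m - \mathbf{U}_2^m\|^2]} - \sqrt{E[\|\mathbf{X}_0^m - \mathbf{X}_1^m\|^2]}. \qquad (2)$$

We wish to lower bound $E[\|\mathbf{X}_1^m - \mathbf{U}_2^m\|^2]$. The second term on the RHS is smaller than $\sqrt{mP}$. Therefore, it suffices to lower bound the first term on the RHS of (2). 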

With what distortion can $\mathbf{x}_0^m$ be communicated to the decoder? The capacity of the parallel channel is the sum of the two capacities, $C_{sum} = R_{ex} + C_{implicit}$. The capacity $C_{implicit}$ is upper bounded by $\frac{1}{2}\log_2(1 + \bar{P})$, where $\bar{P} := (\sigma_0 + \sqrt{P})^2$. Thus the distortion in reconstructing $\mathbf{x}_0^m$ is lower bounded by 

$$\sigma_0^2 2^{-2C_{sum}} \ge \frac{\sigma_0^2 2^{-2R_{ex}}}{(\sigma_0 + \sqrt{P})^2 + 1}.$$

This proves the lower bound in Theorem 1. 

Upper bound 
Quantization: This strategy is used for $\sigma_0^2 > 1$. Quantize the initial state using a random codebook. Bin the codewords randomly into $2^{mR_{ex}}$ bins, and send the bin index on the external channel. On the implicit channel, send the codeword closest to the vector $\mathbf{x}_0^m$. 

The decoder looks at the bin index on the external channel, and keeps only the codewords that correspond to the bin index. This subset of the codebook, which now corresponds to the set of valid codewords, has rate $C_{implicit}$. The required power $P$ is the same as the distortion introduced in the source $\mathbf{x}_0^m$. 

The other strategies that complement this binning strategy are the analogs of zero-forcing and zero-input. 

Analog of the zero-forcing strategy: The state $\mathbf{x}_0^m$ is quantized using a rate-distortion codebook of $2^{mR_{ex}}$ points. The encoder sends the index of the nearest quantization point on the external channel. Instead of forcing the state all the way to zero, the input is used to force the state to the nearest quantization point. The required power is given by the distortion $\sigma_0^2 2^{-2R_{ex}}$. The decoder knows exactly which quantization point was used, so the second stage cost is zero. The total cost is therefore $k^2 \sigma_0^2 2^{-2R_{ex}}$. 

Analog of the zero-input strategy 

Case 1: $\sigma_0^2 < 4$. 

Quantize the space of initial state realizations using a random codebook of rate $R_{ex}$, with the codeword elements chosen i.i.d. $\mathcal{N}(0, \sigma_0^2(1 - 2^{-2R_{ex}}))$. Send the index of the nearest codeword on the external channel, and ignore the implicit channel. The asymptotic achieved distortion is given by the distortion-rate function of the Gaussian source, $\sigma_0^2 2^{-2R_{ex}}$. 

Case 2: $R_{ex} < 2$. Do not use the external channel. Perform an MMSE operation at the decoder on the state $\mathbf{x}_1^m$. The resulting error is $\frac{\sigma_0^2}{\sigma_0^2 + 1}$. 

Case 3: $\sigma_0^2 > 4$, $R_{ex} > 2$. 

Our proofs in this part follow those in [29]. Let $R_{code} = R_{ex} + \frac{1}{2}\log_2\left(\frac{\sigma_0^2}{3}\right) - \epsilon$. A codebook of rate $R_{code}$ is designed as follows. Each codeword is chosen randomly and uniformly inside a sphere centered at the origin and of radius $\sqrt{m(\sigma_0^2 - D)}$, where $D = \sigma_0^2 2^{-2R_{code}} = 3 \times 2^{-2(R_{ex} - \epsilon)}$. This is the attained asymptotic distortion when the codebook is used to represent^6 $\mathbf{x}_0^m$. 
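The two expressions for $D$ agree once the codebook rate is taken to be $R_{code} = R_{ex} + \frac{1}{2}\log_2(\sigma_0^2/3) - \epsilon$, which is the reading of the rate assumed here. The short worked check below spells out the algebra.

```latex
\begin{align*}
D &= \sigma_0^2 \, 2^{-2R_{code}}
   = \sigma_0^2 \, 2^{-2R_{ex}} \, 2^{-\log_2(\sigma_0^2/3)} \, 2^{2\epsilon} \\
  &= \sigma_0^2 \cdot \frac{3}{\sigma_0^2} \cdot 2^{-2(R_{ex} - \epsilon)}
   = 3 \times 2^{-2(R_{ex} - \epsilon)}.
\end{align*}
```

In particular, for $R_{ex} > 2$ this gives $D < \frac{3}{16} \, 2^{2\epsilon}$, so the distortion stays well below the unit noise variance when $\epsilon$ is small.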

Distribute the $2^{mR_{code}}$ points randomly into $2^{mR_{ex}}$ bins that are indexed $\{1, 2, \ldots, 2^{mR_{ex}}\}$. The encoder chooses the codeword $\mathbf{x}_{code}^m$ that is closest to the initial state. It sends the bin-index (say $i$) of the codeword across the external channel. 

Let $\mathbf{z}_{code}^m = \mathbf{x}_0^m - \mathbf{x}_{code}^m$. The received signal is $\mathbf{y}_2^m = \mathbf{x}_0^m + \mathbf{z}^m = \mathbf{x}_{code}^m + \mathbf{z}_{code}^m + \mathbf{z}^m$, which can be thought of as receiving a noisy version of the codeword $\mathbf{x}_{code}^m$ with a total noise of variance $D + 1$, since $\mathbf{z}_{code}^m$ is independent of $\mathbf{z}^m$. 

The decoder receives the bin-index $i$ on the external channel. Its goal is to find $\mathbf{x}_{code}^m$. It looks for a codeword from bin-index $i$ in a sphere of radius $\sqrt{m(D + 1 + \epsilon)}$ around $\mathbf{y}_2^m$. We now show that it can find $\mathbf{x}_{code}^m$ with probability converging to 1 as $m \to \infty$. A rigorous proof that the MMSE also converges to zero can be obtained along the lines of the proof in [13]. 

To prove that the error probability converges to zero, consider the total number of codewords that lie in the decoding sphere. The average number of such codewords can be bounded. Let us pick another codeword in the decoding sphere. The probability that this codeword has index $i$ is $2^{-mR_{ex}}$. Using the union bound, the probability that there exists another codeword in the decoding sphere of index $i$ can be bounded as well. It now suffices to show that the second term converges to zero as $m \to \infty$. This follows from $D = 3 \times 2^{-2(R_{ex} - \epsilon)}$ together with $R_{ex} > 2$, $\sigma_0^2 > 4$, and a small enough $\epsilon$. 

Thus the cost here is bounded by $3 \times 2^{-2(R_{ex} - \epsilon)}$, which is bounded by $4 \times 2^{-2R_{ex}}$ for small enough $\epsilon$. 

6 The asymptotic distortion attained by a uniform-distributed random codebook and a Gaussian random codebook of the same variance is the same [29]. 

1) Bounded ratios for the asymptotic problem: The upper bound is the best of the vector-quantization bound $2k^2 2^{-2R_{ex}}$, the zero-forcing bound $k^2 \sigma_0^2 2^{-2R_{ex}}$, and the zero-input bounds of $\sigma_0^2 2^{-2R_{ex}}$ and $4 \times 2^{-2R_{ex}}$. 
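The ratio of the bounds can be explored numerically. The sketch below is a hedged illustration, not the evaluation used for the paper's figures: it assumes the lower bound has the form $\inf_P k^2 P + ((\sqrt{\kappa} - \sqrt{P})^+)^2$ with $\kappa = \sigma_0^2 2^{-2R_{ex}} / ((\sigma_0 + \sqrt{P})^2 + 1)$, applies each upper bound only in the regime where its strategy is described, and uses hypothetical function names and a simple grid search.

```python
import math


def kappa(P, sigma0_sq, R_ex):
    """Assumed distortion bound sigma0^2 2^(-2 R_ex) / ((sigma0 + sqrt(P))^2 + 1)."""
    sigma0 = math.sqrt(sigma0_sq)
    return sigma0_sq * 2.0 ** (-2 * R_ex) / ((sigma0 + math.sqrt(P)) ** 2 + 1)


def lower_bound(k, sigma0_sq, R_ex, grid=4000):
    """Grid search for the minimum over P of k^2 P + (max(sqrt(kappa) - sqrt(P), 0))^2."""
    p_max = kappa(0.0, sigma0_sq, R_ex)   # beyond this the second term is already zero
    best = float("inf")
    for i in range(grid + 1):
        P = p_max * (i / grid) ** 2       # finer spacing near P = 0
        gap = max(math.sqrt(kappa(P, sigma0_sq, R_ex)) - math.sqrt(P), 0.0)
        best = min(best, k ** 2 * P + gap ** 2)
    return best


def upper_bound(k, sigma0_sq, R_ex):
    """Best of the listed strategy costs, each used only in its stated regime."""
    scale = 2.0 ** (-2 * R_ex)
    candidates = [k ** 2 * sigma0_sq * scale]      # analog of zero-forcing
    if sigma0_sq > 1:
        candidates.append(2 * k ** 2 * scale)      # vector quantization with binning
    if sigma0_sq < 4:
        candidates.append(sigma0_sq * scale)       # zero-input, small variance
    if sigma0_sq > 4 and R_ex > 2:
        candidates.append(4 * scale)               # zero-input, large variance and rate
    return min(candidates)


k, sigma0_sq, R_ex = 1.0, 4.0, 1.0
print(upper_bound(k, sigma0_sq, R_ex) / lower_bound(k, sigma0_sq, R_ex))
```

Sweeping `k`, `sigma0_sq` and `R_ex` over a grid with these two functions gives a rough picture of how the ratio behaves across the parameter space.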

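Case 1: $P^* > \frac{2^{-2R_{ex}}}{16}$. 

In this case, the lower bound is larger than $k^2 \frac{2^{-2R_{ex}}}{16}$. Using the upper bound of $4 \times 2^{-2R_{ex}}$, the ratio is smaller than 64. 

If $P^* > \frac{\sigma_0^2 2^{-2R_{ex}}}{25}$, the lower bound is larger than $k^2 \frac{\sigma_0^2 2^{-2R_{ex}}}{25}$, and using the zero-forcing upper bound of $k^2 \sigma_0^2 2^{-2R_{ex}}$, the ratio is smaller than 25. 

If $P^* \le \frac{\sigma_0^2 2^{-2R_{ex}}}{25}$, then for $\sigma_0 \le 1$, 

$$\kappa = \frac{\sigma_0^2 2^{-2R_{ex}}}{(\sigma_0 + \sqrt{P^*})^2 + 1} \ge \frac{\sigma_0^2 2^{-2R_{ex}}}{\left(\frac{1}{5} + 1\right)^2 + 1} = \frac{25}{61} \sigma_0^2 2^{-2R_{ex}}.$$

Thus, a lower bound on MMSE, and hence also on the total costs, is 

$$\left(\sqrt{\frac{25}{61}} - \frac{1}{5}\right)^2 \sigma_0^2 2^{-2R_{ex}} \approx 0.19 \, \sigma_0^2 2^{-2R_{ex}}.$$

Using the upper bound of $\sigma_0^2 2^{-2R_{ex}}$, the ratio is smaller than $\frac{1}{0.19} < 6$. 

Numerical evaluations, shown in Fig. 6, show that the ratio is smaller than 8. ■ 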

B. Scalar case 

We first derive a lower bound for finite vector lengths. The obtained bounds are tighter than those in Theorem 1 and depend explicitly on the vector length $m$. 

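Theorem 2 (Refined lower bound for finite-lengths): For a finite-dimensional vector version of the problem, if for a strategy $\gamma$ the average power $\frac{1}{m} E_{\mathbf{X}_0^m}[\|\mathbf{U}_1^m\|^2] = P$, the following lower bound holds on the second stage cost for any choice of $\sigma_G^2 \ge 1$ and $L > 0$: 

$$E_{\mathbf{X}_0^m, \mathbf{Z}^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}^m)\right] \ge \eta(P, \sigma_0^2, \sigma_G^2, L),$$

where 

$$\eta(P, \sigma_0^2, \sigma_G^2, L) = \frac{\sigma_G^m}{c_m(L)} \exp\left(-\frac{mL^2(\sigma_G^2 - 1)}{2}\right) \left(\left(\sqrt{\kappa_2(P, \sigma_0^2, \sigma_G^2, L)} - \sqrt{P}\right)^+\right)^2,$$

where 

$$\kappa_2(P, \sigma_0^2, \sigma_G^2, L) := \frac{\sigma_0^2 \sigma_G^2 2^{-2R_{ex}}}{c_m^{2/m}(L) \, e^{1 - d_m(L)} \left((\sigma_0 + \sqrt{P})^2 + d_m(L) \sigma_G^2\right)},$$

$c_m(L) := \frac{1}{\Pr(\|\mathbf{Z}^m\|^2 \le mL^2)} = \left(1 - \psi(m, L\sqrt{m})\right)^{-1}$, $d_m(L) := \frac{\Pr(\|\mathbf{Z}^{m+2}\|^2 \le mL^2)}{\Pr(\|\mathbf{Z}^m\|^2 \le mL^2)} = \frac{1 - \psi(m+2, L\sqrt{m})}{1 - \psi(m, L\sqrt{m})}$, $0 < d_m(L) < 1$, and $\psi(m, r) = \Pr(\|\mathbf{Z}^m\| \ge r)$. Thus the following lower bound holds on the total cost 

$$\bar{J}_{min}(m, k^2, \sigma_0^2) \ge \inf_{P \ge 0} k^2 P + \eta(P, \sigma_0^2, \sigma_G^2, L) \qquad (4)$$

for any choice of $\sigma_G^2 \ge 1$ and $L > 0$ (the choice can depend on $P$). Further, these bounds are at least as tight as those of Theorem 1 for all values of $k$ and $\sigma_0^2$. 

Fig. 6. The ratio of upper and lower bounds for the asymptotic problem is bounded by a factor of 8 for all $k$, $\sigma_0^2$ and $R_{ex}$. 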
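For the scalar case ($m = 1$) the quantities $c_m(L)$ and $d_m(L)$ have closed forms, because $\psi(1, r)$ and $\psi(3, r)$ are the tail probabilities of the norm of a standard Gaussian vector of length 1 and 3. The sketch below evaluates them, taking $c_m(L) = (1 - \psi(m, L\sqrt{m}))^{-1}$ and $d_m(L) = (1 - \psi(m+2, L\sqrt{m})) / (1 - \psi(m, L\sqrt{m}))$ as the definitions; the function names are hypothetical.

```python
import math


def psi1(r):
    """psi(1, r) = Pr(|Z| >= r) for a scalar standard Gaussian Z."""
    return math.erfc(r / math.sqrt(2))


def psi3(r):
    """psi(3, r) = Pr(||Z^3|| >= r): tail of the norm of a length-3 standard Gaussian."""
    return psi1(r) + r * math.sqrt(2 / math.pi) * math.exp(-r * r / 2)


def c_scalar(L):
    """c_1(L) = 1 / Pr(Z^2 <= L^2)."""
    return 1.0 / (1.0 - psi1(L))


def d_scalar(L):
    """d_1(L) = Pr(||Z^3||^2 <= L^2) / Pr(Z^2 <= L^2)."""
    return (1.0 - psi3(L)) / (1.0 - psi1(L))


for L in (0.5, 1.0, 2.0, 3.0):
    print(L, c_scalar(L), d_scalar(L))
```

As $L$ grows, both quantities approach 1, which is the regime in which the finite-length bound approaches the length-independent one.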

Proof: We remark that the only difference in this lower bound as compared to that in [15] is the term for $R_{ex}$ in the expression for $\kappa_2$. The proof follows along the lines of that of [15, Theorem 3]. See Appendix I for the proof. ■ 

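Theorem 3 (Upper bound for scalar case): An upper bound on costs for the scalar case is given by 

$$J_{opt} \le \min\Big\{ \inf_{P \ge 2^{-2R_{ex}}} k^2 P + \psi(3, 2^{R_{ex}}\sqrt{P}), \; c k^2 \sigma_0^2 2^{-2R_{ex}}, \; c \sigma_0^2 2^{-2R_{ex}}, \; a^2 2^{-2R_{ex}} + (1 + a)^2 e^{-\frac{a^2}{2} + \frac{3}{2}(1 + \ln(a^2))} \Big\},$$

where^7 $c \le 2.72$ and $\psi(m, r)$ is defined in Theorem 2. 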

Proof: Just as for the asymptotic case, each term in the upper bound corresponds to a certain strategy. 

Quantization 

Divide the real line into uniform quantization bins of size $2\sqrt{P}$. The quantization points are located at the center of these bins. Number consecutive bins $i \pmod{2^{R_{ex}}}$, starting with the bin which contains the origin. The encoder forces the initial state to the quantization point closest to the initial state, requiring a power of at most $P$. It also sends the index of the quantization bin on the external channel. 

The decoder looks at the bin index, and finds the nearest quantization point corresponding to the particular bin index. The resulting MMSE error is given by $E\left[z^2 1_{\{|z| > 2^{R_{ex}}\sqrt{P}\}}\right]$. This is shown to equal $\psi(3, 2^{R_{ex}}\sqrt{P})$ in [15]. This yields the first term. 
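The identity between the truncated second moment and $\psi(3, \cdot)$ can be checked numerically. The sketch below compares a midpoint-rule integral of $E[z^2 1_{\{|z| > a\}}]$ against the closed form of the length-3 Gaussian norm tail, and evaluates the resulting quantization-strategy cost $k^2 P + \psi(3, 2^{R_{ex}}\sqrt{P})$. The function names and integration settings are assumptions made for illustration.

```python
import math


def psi3(r):
    """psi(3, r) = Pr(||Z^3|| > r) for a standard Gaussian vector of length 3."""
    return math.erfc(r / math.sqrt(2)) + r * math.sqrt(2 / math.pi) * math.exp(-r * r / 2)


def truncated_second_moment(a, upper=12.0, steps=200_000):
    """Midpoint-rule estimate of E[z^2 1{|z| > a}] for z ~ N(0, 1)."""
    width = (upper - a) / steps
    total = 0.0
    for i in range(steps):
        z = a + (i + 0.5) * width
        total += z * z * math.exp(-z * z / 2)
    # Factor 2 accounts for both tails; 1/sqrt(2 pi) normalizes the density.
    return 2 * total * width / math.sqrt(2 * math.pi)


def quantization_cost(k, P, R_ex):
    """Cost k^2 P + psi(3, 2^R_ex sqrt(P)) of the scalar quantization strategy."""
    return k ** 2 * P + psi3(2 ** R_ex * math.sqrt(P))


print(truncated_second_moment(1.0), psi3(1.0))
print(quantization_cost(k=1.0, P=0.25, R_ex=1))
```

Minimizing `quantization_cost` over `P` for fixed `k` and `R_ex` gives the first term of the upper bound for those parameters.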



Analog of zero-forcing 

Quantize the real line using a quantization codebook of rate $R_{ex}$. The encoder forces $x_0$ to the nearest quantization point, and sends the index of the point to the decoder. The distortion is bounded by $2.72 \, \sigma_0^2 2^{-2R_{ex}}$ [32]. The decoder has a perfect estimate of $x_1$, thus the total cost is bounded by $c k^2 \sigma_0^2 2^{-2R_{ex}}$. 

Analog of zero-input 

As for the asymptotic case, we break this case into two strategies. For $\sigma_0^2 < 4$, we again use a quantization codebook of rate $R_{ex}$, but instead of zero-forcing the state, we take the distortion hit at the decoder. The resulting cost is $c \sigma_0^2 2^{-2R_{ex}}$. 

For $\sigma_0^2 > 4$, we use a construct based on the idea of sending coarse information across the implicit channel, and fine information across the explicit channel. Divide the entire line into coarse quantization bins of size $2a$. Divide each bin into $2^{R_{ex}}$ sub-bins, each of size $2a 2^{-R_{ex}}$. Number the sub-bins within each coarse bin $0, 1, \ldots, 2^{R_{ex}} - 1$. 

The encoder sends the index of the sub-bin in which $x_0$ lies across the external channel. The decoder decodes this sub-bin by finding the nearest sub-bin to the received output that has the same index as that received across the external channel. 

If the decoder decodes the correct sub-bin, the error is bounded by $a^2 2^{-2R_{ex}}$. In the event when there is an error in decoding of the sub-bin, the error is bounded by $(|z| + a)^2$, which averaged under the error event $|z| > a$ takes exactly the form of [15, Lemma 1]. Using that lemma, the MMSE in the error event is bounded by 

$$E\left[(|z| + a)^2 1_{\{|z| > a\}}\right] \le \left(\sqrt{\psi(3, a)} + a\sqrt{\psi(1, a)}\right)^2 \le (1 + a)^2 e^{-\frac{a^2}{2} + \frac{3}{2}(1 + \ln(a^2))}.$$

Thus the total MMSE is bounded by 

$$\mathrm{MMSE} \le a^2 2^{-2R_{ex}} + (1 + a)^2 e^{-\frac{a^2}{2} + \frac{3}{2}(1 + \ln(a^2))}.$$

7 This upper bound on $c$ is the believed upper bound on the distortion-rate function $D_s(R) = c \sigma_0^2 2^{-2R}$ of a scalar Gaussian source. We have been unable to find a rigorous proof of this result, although the result is known to hold at high rates [30], and Lloyd's empirical results [31, Table VIII] suggest that the bound holds for all rates. 




Fig. 7. Ratio of upper and lower bounds on the scalar problem for various values of $R_{ex}$. The ratio diverges to infinity as $R_{ex} \to \infty$. 

Fig. 7 shows the ratio of the upper bound of Theorem 3 and the lower bound of Theorem 2 in the $(k, \sigma_0)$-parameter space. Even though the ratio is bounded for each $R_{ex}$, it blows up as $R_{ex} \to \infty$. 

VI. Discussions and conclusions 



The asymptotic result in Section V-A extends easily to an asymptotic version of the problem with a Gaussian external channel (Section IV). This is because the error probability on the external channel converges to zero as the vector length $m \to \infty$ for any $R_{ex} < C_{ex}$, the capacity of the external channel, making it behave like a fixed-rate external channel. Using large-deviation techniques, there is hope that the scalar problem with a Gaussian external channel may also be solved approximately. 

A rate-limited noiseless channel can be thought of as a model for limited-memory controllers. The problem of Fig. 2 can then be interpreted as a single-controller system with finite memory. The problem considered here is therefore also a toy problem that can help design strategies for finite-memory controller problems. 

Appendix I 

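Proof of lower bound for finite-length problem 

Proof: From Theorem 1, for a given $P$, a lower bound on the average second stage cost is $\left(\left(\sqrt{\kappa} - \sqrt{P}\right)^+\right)^2$. We derive another lower bound that is equal to the expression for $\eta(P, \sigma_0^2, \sigma_G^2, L)$. 

Define $S_L^G := \{\mathbf{z}^m : \|\mathbf{z}^m\|^2 \le mL^2 \sigma_G^2\}$ and use subscripts to denote which probability model is being used for the second stage observation noise. $Z$ denotes white Gaussian of variance 1 while $G$ denotes white Gaussian of variance $\sigma_G^2 \ge 1$. 

$$E_{\mathbf{X}_0^m, \mathbf{Z}^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}^m)\right] \ge \int_{\mathbf{z}^m \in S_L^G} \left(\int_{\mathbf{x}_0^m} J_2^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m) f_0(\mathbf{x}_0^m) \, d\mathbf{x}_0^m\right) f_Z(\mathbf{z}^m) \, d\mathbf{z}^m$$
$$= \int_{\mathbf{z}^m \in S_L^G} \left(\int_{\mathbf{x}_0^m} J_2^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m) f_0(\mathbf{x}_0^m) \, d\mathbf{x}_0^m\right) \frac{f_Z(\mathbf{z}^m)}{f_G(\mathbf{z}^m)} f_G(\mathbf{z}^m) \, d\mathbf{z}^m. \qquad (5)$$

The ratio of the two probability density functions is given by 

$$\frac{f_Z(\mathbf{z}^m)}{f_G(\mathbf{z}^m)} = \frac{\left(\sqrt{2\pi\sigma_G^2}\right)^m}{\left(\sqrt{2\pi}\right)^m} \cdot \frac{e^{-\frac{\|\mathbf{z}^m\|^2}{2}}}{e^{-\frac{\|\mathbf{z}^m\|^2}{2\sigma_G^2}}} = \sigma_G^m \, e^{-\frac{\|\mathbf{z}^m\|^2}{2}\left(1 - \frac{1}{\sigma_G^2}\right)}.$$

Observe that for $\mathbf{z}^m \in S_L^G$, $\|\mathbf{z}^m\|^2 \le mL^2 \sigma_G^2$. Using $\sigma_G^2 \ge 1$, we obtain 

$$\frac{f_Z(\mathbf{z}^m)}{f_G(\mathbf{z}^m)} \ge \sigma_G^m \, e^{-\frac{mL^2(\sigma_G^2 - 1)}{2}}. \qquad (6)$$

Using (5) and (6), 

$$E_{\mathbf{X}_0^m, \mathbf{Z}^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}^m)\right] \ge \sigma_G^m \, e^{-\frac{mL^2(\sigma_G^2 - 1)}{2}} \, E_{\mathbf{X}_0^m, \mathbf{Z}_G^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}_G^m) \,\middle|\, \mathbf{Z}_G^m \in S_L^G\right] \Pr(\mathbf{Z}_G^m \in S_L^G). \qquad (7)$$

It is shown in [15] that 

$$\Pr(\mathbf{Z}_G^m \in S_L^G) = 1 - \psi(m, L\sqrt{m}) = \frac{1}{c_m(L)}. \qquad (8)$$

From (7) and (8), 

$$E_{\mathbf{X}_0^m, \mathbf{Z}^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}^m)\right] \ge \frac{\sigma_G^m}{c_m(L)} \, e^{-\frac{mL^2(\sigma_G^2 - 1)}{2}} \, E_{\mathbf{X}_0^m, \mathbf{Z}_G^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}_G^m) \,\middle|\, \mathbf{Z}_G^m \in S_L^G\right]. \qquad (9)$$

We now need the following lemma, which connects the new finite-length lower bound to the length-independent lower bound of Theorem 1. 

Lemma 2: 

$$E_{\mathbf{X}_0^m, \mathbf{Z}_G^m}\left[J_2^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}_G^m) \,\middle|\, \mathbf{Z}_G^m \in S_L^G\right] \ge \left(\left(\sqrt{\kappa_2(P, \sigma_0^2, \sigma_G^2, L)} - \sqrt{P}\right)^+\right)^2$$

for any $L > 0$. 

Proof: This is a reworking of the proof for the asymptotic case to a channel which has a truncated Gaussian noise of (pre-truncation) variance $\sigma_G^2$ and a truncation to $\mathbf{Z}_G^m \in S_L^G$. Details are omitted due to space constraints. The derivation follows exactly the lines of [15, Lemma 2]. ■ 

The lower bound on the total average cost now follows from (9) and Lemma 2. ■ 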